29 research outputs found

    Fast and Robust Object Detection Using Visual Subcategories

    Object classes generally contain large intra-class variation, which poses a challenge to object detection schemes. In this work, we study visual subcategorization as a means of capturing appearance variation. First, training data is clustered using color and gradient features. Second, the clustering is used to learn an ensemble of models that capture visual variation due to varying orientation, truncation, and occlusion degree. Fast object detection is achieved with integral image features and pixel lookup features. The framework is studied in the context of vehicle detection on the challenging KITTI dataset.
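The abstract credits integral image features for fast detection. As a rough illustration of why such features are cheap, the sketch below builds a summed-area table and reads the sum of any rectangle with four lookups. It is a generic textbook construction, not the paper's feature pipeline; the names `integral_image` and `box_sum` and the toy 3x3 image are assumptions made for this example.

```python
def integral_image(img):
    """Return a summed-area table padded with one zero row and one zero column."""
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above (previous table row) plus this row so far.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical toy image, used only to exercise the functions.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
```

Once the table is built, the cost of `box_sum` does not depend on the rectangle size, which is what makes sliding-window evaluation of box features inexpensive.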

    Beyond just keeping hands on the wheel: Towards visual interpretation of driver hand motion patterns

    Abstract: Observing hand activity in the car provides a rich set of patterns relating to vehicle maneuvering, secondary tasks, driver distraction, and driver intent inference. This work strives to develop a vision-based framework for analyzing such patterns in real time. First, hands are detected and tracked from a monocular camera. This provides position information of the left and right hands with no intrusion over long, naturalistic drives. Second, the motion trajectories are studied in settings of activity recognition, prediction, and higher-level semantic categorization.

    Learning to Drive Anywhere

    Human drivers can seamlessly adapt their driving decisions across geographical locations with diverse conditions and rules of the road, e.g., left vs. right-hand traffic. In contrast, existing models for autonomous driving have been thus far only deployed within restricted operational domains, i.e., without accounting for varying driving behaviors across locations or model scalability. In this work, we propose AnyD, a single geographically-aware conditional imitation learning (CIL) model that can efficiently learn from heterogeneous and globally distributed data with dynamic environmental, traffic, and social characteristics. Our key insight is to introduce a high-capacity geo-location-based channel attention mechanism that effectively adapts to local nuances while also flexibly modeling similarities among regions in a data-driven manner. By optimizing a contrastive imitation objective, our proposed approach can efficiently scale across inherently imbalanced data distributions and location-dependent events. We demonstrate the benefits of our AnyD agent across multiple datasets, cities, and scalable deployment paradigms, i.e., centralized, semi-supervised, and distributed agent training. Specifically, AnyD outperforms CIL baselines by over 14% in open-loop evaluation and 30% in closed-loop testing on CARLA.
    Comment: Conference on Robot Learning (CoRL) 202
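To make the phrase "geo-location-based channel attention" more concrete, here is a deliberately simplified sketch of the general idea of gating feature channels by region. It is not the AnyD architecture: the region names, the logit values in `REGION_LOGITS`, and the function `geo_channel_attention` are all hypothetical, and in a real model the per-region logits would be learned rather than written by hand.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical per-region logits, one per feature channel.
# A learned model would produce these from a region embedding.
REGION_LOGITS = {
    "region_a": [2.0, -2.0, 0.0],
    "region_b": [-2.0, 2.0, 0.0],
}


def geo_channel_attention(features, region):
    """Scale each feature channel by a gate that depends on the region."""
    gates = [sigmoid(v) for v in REGION_LOGITS[region]]
    return [f * g for f, g in zip(features, gates)]


# The same features are re-weighted differently in each region.
out_a = geo_channel_attention([1.0, 1.0, 1.0], "region_a")
out_b = geo_channel_attention([1.0, 1.0, 1.0], "region_b")
```

In this toy setup the two regions emphasize different channels while sharing the third equally, which loosely mirrors the abstract's description of adapting to local nuances while modeling similarities among regions.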

    Integrating motion and appearance for overtaking vehicle detection

    Abstract: The dynamic appearance of vehicles as they enter and exit a scene makes vehicle detection a difficult and complicated problem. Appearance-based detectors generally provide good results when vehicles are in clear view, but have trouble at the scene's edges due to changes in the vehicle's aspect ratio and partial occlusions. To compensate for some of these deficiencies, we propose incorporating motion cues from the scene. In this work, we focus on overtaking vehicle detection in a freeway setting with front- and rear-facing monocular cameras. Motion cues are extracted from the scene, and by leveraging the epipolar geometry of the monocular setup, motion compensation is performed. Spectral clustering is used to group similar motion vectors together, and after post-processing, vehicle detection candidates are produced. Finally, these candidates are combined with an appearance detector to remove any false positives, outputting the detections as a vehicle travels through the scene.